A Dictionary-Based Compressed Pattern Matching Algorithm

Authors

  • Meng-Hang Ho
  • Hsu-Chun Yen
Abstract

Compressed pattern matching is the task of finding all occurrences of a pattern in a compressed text without decompressing it. To use bandwidth more effectively in the Internet environment, it is highly desirable that data be stored and transmitted in compressed form. To support information retrieval on compressed data, compressed pattern matching has been gaining increasing attention from both theoretical and practical viewpoints. In this article, we design and implement a dictionary-based compressed pattern matching algorithm. Our algorithm takes advantage of the dictionary structure common to the LZ78 family. With the help of a slightly modified dictionary structure, we are able to perform 'block decompression' (a key step in many existing compressed pattern matching schemes) and pattern matching on the fly, which improves performance, as our experimental results indicate.
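To make the ideas in the abstract concrete, the following is a minimal sketch and not the authors' algorithm: the abstract does not specify their modified dictionary structure. It assumes a textbook LZ78 encoding stored as (parent index, character) pairs, decodes one phrase at a time by walking parent pointers (a simple form of block decompression), and feeds each decoded block into a standard Knuth-Morris-Pratt matcher so the full text is never materialized. All function names here are hypothetical.

```python
def lz78_compress(text):
    """Return LZ78 phrases as (parent_index, char) pairs; index 0 is the empty phrase."""
    trie = {}      # (parent_index, char) -> phrase index
    phrases = []   # phrase k (1-based) is stored at phrases[k - 1]
    node = 0
    for ch in text:
        nxt = trie.get((node, ch))
        if nxt is not None:
            node = nxt                      # keep extending the current match
        else:
            phrases.append((node, ch))      # new phrase = known phrase + one char
            trie[(node, ch)] = len(phrases)
            node = 0
    if node != 0:
        # Text ended inside a known phrase: emit it with an empty extension char.
        phrases.append((node, ""))
    return phrases


def decode_block(phrases, k):
    """Decode phrase k alone by following parent pointers back to the root."""
    chars = []
    while k != 0:
        parent, ch = phrases[k - 1]
        chars.append(ch)
        k = parent
    return "".join(reversed(chars))


def failure_table(pattern):
    """Standard KMP failure function."""
    fail = [0] * len(pattern)
    j = 0
    for i in range(1, len(pattern)):
        while j and pattern[i] != pattern[j]:
            j = fail[j - 1]
        if pattern[i] == pattern[j]:
            j += 1
        fail[i] = j
    return fail


def search_compressed(phrases, pattern):
    """Return start offsets (in the uncompressed text) of all pattern occurrences."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    fail = failure_table(pattern)
    matches = []
    state = 0   # KMP state carried across block boundaries
    pos = 0     # number of uncompressed characters consumed so far
    for k in range(1, len(phrases) + 1):
        for ch in decode_block(phrases, k):
            while state and ch != pattern[state]:
                state = fail[state - 1]
            if ch == pattern[state]:
                state += 1
            pos += 1
            if state == len(pattern):
                matches.append(pos - len(pattern))
                state = fail[state - 1]
    return matches


if __name__ == "__main__":
    phrases = lz78_compress("abracadabra abracadabra")
    print(search_compressed(phrases, "abra"))
```

Because the matcher state persists between blocks, occurrences that straddle phrase boundaries are still found. This sketch re-decodes every phrase character by character, so it shows only the streaming structure; the speedup reported in the abstract would depend on the authors' modified dictionary, which is not reproduced here.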

Similar resources

A Unifying Framework for Compressed Pattern Matching

We introduce a general framework which is suitable to capture an essence of compressed pattern matching according to various dictionary based compressions. The goal is to find all occurrences of a pattern in a text without decompression, which is one of the most active topics in string matching. Our framework includes such compression methods as Lempel-Ziv family, (LZ77, LZSS, LZ78, LZW), byte-...

A Boyer-Moore Type Algorithm for Compressed Pattern Matching

We apply the Boyer–Moore technique to compressed pattern matching for text string described in terms of collage system, which is a formal framework that captures various dictionary-based compression methods. For a subclass of collage systems that contain no truncation, our new algorithm runs in O(‖D‖ + n · m + m + r) time using O(‖D‖ + m) space, where ‖D‖ is the size of dictionary D, n is the c...

Multiple Pattern Matching Algorithms on Collage System

Compressed pattern matching is one of the most active topics in string matching. The goal is to find all occurrences of a pattern in a compressed text without decompression. Various algorithms have been proposed depending on underlying compression methods in the last decade. Although some algorithms for multipattern searching on compressed text were also presented very recently, all of them are...

Collage system: a unifying framework for compressed pattern matching

We introduce a general framework which is suitable to capture the essence of compressed pattern matching according to various dictionary-based compressions. It is a formal system to represent a string by a pair of dictionary D and sequence S of phrases in D. The basic operations are concatenation, truncation, and repetition. We also propose a compressed pattern matching algorithm for the framew...

JPEG-LS Based Two-Dimensional Compressed Pattern Matching

With the phenomenal advances in data acquisition techniques via satellites and in medical diagnostics and forensic sciences, we have encountered a massive growth of image data. On account of efficiency (in terms of both space and time), there is a need to keep the data in compressed form for as much as possible, even when it is being searched. The class of images we are concerned in this paper ...

Publication date: 2002